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ABSTRACT 

Batch  maximum  likelihood  (ML)  and  maximum  a  posteriori  (MAP)  estimation  with  process  noise  is  now  more  than 
thirty-five  years  old,  and  its  use  in  multiple  target  tracking  has  long  been  considered  to  be  too  computationally  intensive 
for  real-time  applications.  While  this  may  still  be  true  for  general  usage,  it  is  ideally  suited  for  special  needs  such  as 
bias  estimation,  track  initiation  and  spawning,  long-term  prediction  of  track  states,  and  state  estimation  during  periods  of 
rapidly  changing  target  dynamics.  In  this  paper,  we  examine  the  batch  estimator  formulation  for  several  cases:  nonlinear 
and  linear  models,  with  and  without  a  prior  state  estimate  (MAP  vs.  ML),  and  with  and  without  process  noise.  For 
the  nonlinear  case,  we  show  that  a  single  pass  of  an  extended  Kalman  smoother-filter  over  the  data  corresponds  to  a 
Gauss-Newton  step  of  the  corresponding  nonlinear  least-squares  problem.  Even  the  iterated  extended  Kalman  filter  can 
be  viewed  within  this  framework.  For  the  linear  case,  we  develop  a  compact  least  squares  solution  that  can  incorporate 
process  noise  and  the  prior  state  when  available.  With  these  new  views  on  the  batch  approach,  one  may  reconsider  its 
usage  in  tracking  because  it  provides  a  robust  framework  for  the  solution  of  the  aforementioned  problems.  Finally,  we 
provide  some  examples  comparing  linear  batch  initiation  with  and  without  process  noise  to  show  the  value  of  the  new 
approach. 

Keywords:  Batch  ML  Estimation,  Batch  MAP  Estimation,  Nonlinear  Least  Squares,  Track  Initiation  and  Spawning, 
Extrapolation 


1.  INTRODUCTION 

Batch  maximum  likelihood  (ML)  and  maximum  a  posteriori  (MAP)  estimation  with  process  noise  is  now  more  than  thirty- 
five  years  old,1’ 2  but  its  use  in  multiple  target  tracking  has  long  been  considered  to  be  too  computationally  intensive  for 
real-time  applications.  While  this  may  still  be  true,  significant  advances  over  the  last  twenty  years  in  nonlinear  least- 
squares,  nonlinear  optimization,  corresponding  linear  algebra  techniques,  and  computing  power  leads  one  to  re-examine 
its  use  for  special  problems  in  multiple  target  tracking. 

While  “batch  ML/MAP  estimation  with  process  noise”  is  not  intended  to  replace  the  extended  Kalman  filter  (EKF),  the 
unscented  Kalman  filter  (UKF),  the  interacting  multiple  model  (IMM)  algorithm,  or  variable-structure  interacting  multiple 
model  (VSIMM)  filtering  techniques  in  general,  it  may  improve  these  techniques  for  estimation  problems  that  are  highly 
nonlinear  relative  to  a  given  measurement  update  rate  or  signal-to-noise  ratio.  (The  performance  of  the  extended  or 
iterated  extended  Kalman  filter  (or  the  unscented  Kalman  filter)  may  degrade  significantly  in  such  cases.)  Some  potential 
areas  in  which  better  optimization  and  estimation  may  improve  performance  significantly  are  (1)  estimation  of  sensor  and 
navigation  biases,  (2)  long-term  state  prediction,  (3)  track  initiation  and  spawning,  and  (4)  estimation  during  periods  of 
rapidly  changing  dynamics  (i.e.,  maneuvers).  Also,  in  multiple  frame  assignment  (MFA)  or  multiple  hypothesis  tracking 
(MHT)  approaches,  one  employs  a  moving  window  over  which  it  is  natural  to  use  batch  ML  and  MAP  estimation.  Thus, 
even  though  this  approach  is  not  intended  to  replace  extended  Kalman  filtering  or  interacting  multiple  models  (IMM),  it 
may  be  an  efficient  method  for  both  monitoring  the  success  of  these  methods  and  providing  a  robust  numerical  technique 
to  correct  some  of  their  difficulties. 

This  paper  is  organized  as  follows.  In  Section  2,  the  “batch  ML/MAP  estimator  with  process  noise”  is  reviewed  for 
the  nonlinear  problem.  First,  the  MAP  and  the  ML  formulations  are  given,  then  the  relationship  to  the  iterated  Kalman 
filter  and  to  the  iterated  Kalman  smoother-filter  is  explained.  Section  3  addresses  the  “batch  ML/MAP  estimator  with 
process  noise”  for  the  linear  problem.  A  new  compact  formulation  is  given  for  the  batch  estimator  that  handles  both  cases 
with  and  without  a  prior  state  estimate. 
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2.  ML  AND  MAP:  THE  DISCRETE  NONLINEAR  PROBLEM 


It  is  known  that,  with  Gaussian  a  priori  statistics,  the  maximum  a  posteriori  estimate  is  equivalent  to  an  appropriate  least- 
squares  curve  fit,  using  the  inverses  of  the  plant-  and  measurement-noise  covariances  as  weighting  matrices.  This  section 
shows  a  development  of  the  appropriate  cost  function  and  the  least  squares  problem.  Such  a  formulation  can  also  be  posed 
as  a  discrete  or  continuous  nonlinear  two-point  boundary-value  problem.  The  development  in  this  section  follows  closely 
that  in  the  book  by  Jazwinski1  and  the  book  by  Sage  and  Melsa.2 

2.1.  The  model  and  statistical  formulation 

The  discrete  time  state  and  observation  models  are  given  by 

xk+ 1  /fc(^fc)  +  TfcWfcj 

Zk  =  hk{xk)  +vk , 


(1) 

(2) 


where 


xi;  =  n-dimensional  state  vector, 
fk(xk)  =  n-dimensional  vector-valued  function, 
Tfc  =  n  x  r  matrix, 

Wk  =  r-dimensional  plant-noise  vector. 


zk  =  m-dimensional  observation  vector, 
hkixk  )  =  TO-dimensional  vector- valued  function, 
vk  =  TO-dimensional  observation-noise  vector, 


For  the  discrete-estimation  model,  wk  and  vk  are  assumed  to  be  independent  zero-mean  Gauss-Markov  white  sequences 
such  that 


E{wkwJ}  =  Qkh-j,  E{vkvj}  =  RkSk-j,  E{x0wl}  =  0,  E{x0vl}  =  0,  x0  ~AT(x0,Po), 


where  5k-j  is  the  Kronecker  delta  function,  and  Qk  and  Rk  are  nonnegative  definite  rxr  and  to  x  to  covariance  matrices, 
respectively.  The  sequences  xo,  xi, xjv  and  z\.z2.  ■■■■,  Zn  are  denoted  by  Ajv  and  Zjy ,  respectively.  We  further  assume 
that  p(x o)  is  known  and  is  normal  with  mean  Xq  and  covariance  Pq.  We  shall  assume  that  all  functions  are  sufficiently 
smooth  so  that  up  to  two  derivatives  exist  and  are  continuous.  Also,  it  is  typically  assumed  that  the  time  of  the  prior  is 
such  that  f0  <  1 1 . 

The  best  estimate  of  x  throughout  an  interval  will,  in  general,  depend  on  the  criteria  used  to  determine  the  best  estimate. 
Here  the  term  “best  estimate”  denotes  that  estimate  derived  from  maximizing  the  conditional  probability  function  p(X  |  Z) 
as  a  function  of  x  throughout  that  interval,  and  is  known  as  the  maximum  a  posteriori  estimator.  The  following  derivation 
is  adapted  from  that  found  in  the  book  by  Sage  and  Melsa3  and  is  repeated  here  for  completeness. 

Bayes’  rule  applied  to  p(X R  \  Z R)  yields 


p(XN  |  ZN) 


p(ZN  |  XN)  p{XN) 
p(Zn) 


(3) 


If  vk  is  Gaussian  and  xk  is  known,  Eq.  (2)  implies  p(zk  \  xk)  is  Gaussian;  so  that,  if  X^  is  given. 


N 


p{ZN  I  XN)  = 

fc=l 


exP  {~h(zk  -  hk{xk))TRk1{zk  -  hk(xk))} 
( 2n)m /2  det(f?fe)5 


(4) 


Using  the  chain  rule  for  probabilities,  p(a,  /3)  =  p(a  |  /3)  p(/3),  results  in 

p{XN)  =  p(x jv  |  A"jv_i)  p(xN- 1  |  XN_2)  ■  --p(x i  |  x0)  p(x0).  (5) 


Since  wk  is  a  white  Gauss-Markov  sequence,  xk  is  Markov  and  p(xk  \  Xk_i )  =p(xk  |  xk-i).  Thusp(Ajv)  is  composed 
of  Gaussian  terms,  and 


JV 

p(XN)=p(x0)  p{xk  |  Xfc_i), 

fc=i 


(6) 


where  p(xk  \  xk-\)  is  Gaussian  and,  from  equation  (1),  has  mean /fc_i(xfc_i)  and  covariance 

1  —  Tfe— 1  Q k— i  r*,_i. 


p(Z]y)  contains  no  terms  in  xk  since  ZN  is  the  known  conditioning  variable.  Thus  p(ZN  )  can  be  considered  a  normalizing 
constant  with  respect  to  the  intended  maximization.  Analytically, 


p(xk  |  Xk-i ) 


exp{-l(xfc  -  /fc-l(fffc-l))T^fc-iQcfc  -  fk-l(Xk-l))} 
(2-7t)"/2  det(f2fc_i)5 


(7) 


Summarizing,  we  have 

,  ,  p{ZN  |  XN )  p(x jv)  p(x0)  Ft  .  .  w  ,  X 

I  Zjv)  = - - =  .  ,  II  p(zk  I  xk)p[xk  I  Xfc_i) 

_  £(go)  tt  exp  -  hk(xk))TRF(zk  ~  hk{xk))} 

p(Zn)  ~~  (27t)™/2  det(i?fc)5 

exp{-|(xfc  -  fk-i{xk-i))Tttk-i(xk  ~  fk-i(xk-i))} 

(27t)"/2  det(flfc_i)5 

=  Aexpf--||x0  -xo||p-i  -  2  _  hk{xk) ll^-i  -  2  H  11**  “  /fe-i(a;fc-i)llfi-^i  ) 

'  °  fe=l  fe  Z  k= 1  k  1 ' 

where  A-1  =  p(ZN)  n£li(27r)  ™/2  det(f?fc)2  (27t)"/2  det(0fe_i)2 .  Assuming  that  Tfc_i  (and  thus  SI fc _  1 )  does  not 
depend  on  the  state,  the  problem  of  maximizing  p(Xjy  |  Zj\ )  is  equivalent  to 

1  1  N  1  N 

Minimize  J  =  -||x0  -  x0||p-i  +  -  ^  || zk  -  hk(xk)\\R-i  +  -  ^  \\xk  -  /*_i (a:*_i)||^_i  .  (8) 

k  -l  k  1 

We  further  examine  this  expression  for  the  two  cases  of  (with  and  without)  the  prior  in  the  following  subsections. 

2.1.1.  A  second  formulation  of  the  MAP  estimation  problem 

The  nonlinear  least  squares  formulation  for  the  MAP  estimate  is  that  shown  in  (8).  An  equivalent  formulation,  established 
using  the  Moore-Penrose  inverse  of  (!/,._  1 ,  is 


1  1  .  .  1  , 

Minimize  J  =  -poP^p,  0  +  ^  ^vkRklyk  +  (9) 

k= 1  k  -l 

Subject  to  xk  =  fk-i(xk-i)  +  Tfe_iw;fc_i  k  =  l,...,N 
zk=hk(xk)+vk  k  =  l,..-,,N 
Xq  =xq+Po- 


REMARK  1.  The  formulation  in  (9)  is  more  robust  than  the  one  in  (8)  because  when  measurements  are  closely  spaced  in 
time,  flfc  — >  0,  which  leads  to  an  ill-conditioned  optimization  problem. 

As  formulated,  the  prior  is  assumed  to  be  at  a  time  prior  to  the  first  measurement.  The  formulation  is  easily  modified 
to  account  for  a  prior  at  time  to  =  t\ .  In  this  case,  X\  =  fo  (xq)  =  Xq .  and  the  formulation  is 

1  1  N  1  N 

Minimize  J  =  —  ||tci  -  £i||p-i  +  -  ^  \\zk  -  hk(xk)\\R-i  +  -  ^  \\xk  -  /*— 1  ) ||^_i ^ . 

L  k  .1  k  L  k= 2 


(10) 


2.1.2.  A  second  formulation  of  the  ML  estimation  problem 

A  second  formulation  in  which  there  is  no  prior  can  be  extracted  from  (10)  by  assuming  that  the  density  p(x o)  on  Xo  is  a 
uniform  distribution,  i.e.,  a  noninformative  prior.  This  leads  to  the  problem 

1  N  1  N 

Minimize  J  =  -  ^  \\zk  -  hk(xk)\\2R-i  +  -  ^  \\xk  -  /fc-i (tcfe_i ) | |^_j. ^ ,  (11) 

k= 1  k= 2 

which  has  the  equivalent  formulation, 


N 


N 


Minimize  J  =  Rklvk  +  wl-iQkhwk-i 

k=l  k= 2 

Subject  to  xk  =  fk—i  {xk—i )  +  Tk-iwk-i  k  =  l,...,N 


Zk  =  hk(xk)+vk  k  =  1, 
Xq  =  Xo  +  Ho- 


,  N 


2.2.  Relation  to  the  iterated  extended  Kalman  filter  and  filter-smoother 

The  above  formulation  uses  Gaussian  noise  assumptions  and  a  ML/MAP  estimation  formulation  to  compute  the  “modal 
trajectory.”  In  this  section,  an  explanation  of  the  relationship  between  this  formulation  and  the  iterated  extended  Kalman 
filter  (IEKF)  and  iterated  extended  Kalman  filter-smoother  is  provided. 


2.2.1.  The  iterated  extended  Kalman  filter 

In  1993,  Bell  and  Cathey4  showed  that  the  iterated  extended  Kalman  filter  (IEKF)  can  be  viewed  as  a  sequence  of  Gauss- 
Newton  steps  for  a  nonlinear  least  squares  problem.  Here  is  the  explanation.  Given  xk  ~  Af(xkk_  i  •  4  ;.-_J )  and 
zk  =  hk(xk)  +  vk ,  then  each  iteration  of  the  estimate  of  xk  via  an  IEKF  is  obtained  as  a  Gauss-Newton  step  of  the 
nonlinear  least  squares  problem 


Minimize  J  =  —  ||xft 


%k\k— 1 1 


fc|fc-i 


+  h*k 


hk(xk)\\p>- 


(12) 


or  equivalently 


Minimize  J(xk) 


2.11  f Xk  || 2 

2  I'  \hk(xk)-zkJ  K-1’ 


where  Wk  =  diag (Pk\k~i,  Rk),  i.e.,  the  IEKF,  when  convergent,  computes  a  minimum  of  this  nonlinear  least  squares 
problem. 


2.2.2.  The  iterated  extended  Kalman  filter-smoother 

We  now  extend  the  comparison  to  the  case  of  the  filter-smoother.  Given  xk-i  ~  J\f(xk_i\k_i ,  -F* _ 1 1 jfe _ l ) 3  %k  = 
fk~ i(xfc-i)  +  Ejt  iwk  i,  zk  =  hk(xk )  +  vk,  the  estimation-smoothing  problem  for  {xk-i,  xk}  is  posed  as 


Minimize  J  =  —  -  xk_uk_i 


Ip-1  +  „||  zk  hk{xk)\\R-i  +  fk—1  1 )  IIq— i 

fc  —  1 1  fc  —  1  2  nh  2 


or,  equivalently. 


Y  f  %k— 1  %k— l\k  —  1  \  j 1 2 

Minimize  J{xk-X,xk)  =  -  xk  -  fk- i(xfc-i) 

2  1  hk(xk)-zk  <"w‘ 


—  1 
k- 1 


where  Wk- 1  =  diag(Pfc_1|^_1 ,  S2*_i ,  Rk).  This  corresponds  to  the  case  of  N  =  1  in  equation  (8). 
Given  an  initial  approximation  (4_r,  x°),  the  sequence  {(xk_1 ,  x\)  },>o  is  generated  via 


(13) 


(4t\,4+1) 


(4-i  >4)  +  (A4-i>  A4) 


2 


The  correction  terms  (Axlk_1,  Ax\)  are  the  solution  of  the  linear  least  squares  problem 


Minimize 


0 

I 


\  /At'  \  (Xk- 1  xk-l\k-l 

-Dxi_Jk-i(xk-i)  1  (  a^1  )  +  I** -fk-iK-i) 

0  DXkhk(xl)J  V  k  J  \  4(4)  -  zk 


(14) 


wr 


The  normal  equations  for  this  least  squares  problem  (14)  are 


4-i(fe-i  +  44i(4-i)^fc-i-4-i(4-i)  Fk-i(xk-i)^k-i 

~^k-iFk- i(4-i)  0*- 1  +Hl{x{)R-k1Hk{xi) 


=  (  fc-i|fc-i 
0 


Fk-l(xl-l)^k-l 


n*-i 


H'k(xi)Rk 


%k— l\k  —  1  %k— 1 

/fc-i(4-i)  -4 

<  Zk-hk{  x\) 


A4-i 

Ax?. 


Numerically,  one  can  solve  these  normal  equations  using  three  different  factorization  schemes:  a)  Cholesky  Factorization, 
b)  QR  factorization,  c)  singular  value  decomposition  (SVD).  We,  instead,  solve  them  directly  using  the  matrix  inversion 
lemma  (and  considerable  linear  algebra).  The  closed-form  solution  to  the  the  correction  terms  is  given  by 

A4-i  =  ixk-i\k~i  -4-i) 

+  4(4-i)4fe(4-ii4)^*  -  4(4)  -  Hk(x\)[Fk_ i(4-i)(**-i|fc-i  -  4-i)  +  (4-i(4-i)  “  4)]  j 
A x\  =  4- i(x*fc_i)(xfc_ijt,_i  -4_!)  +  (/fc_i(4-1)  -4) 

+  Kk{xtk_1,x'k)(zk  -4(4)  -#44)  [4- i(x*fe_1)(xfe_i|fe_i  -x|_J  +  ~4)]) 


where 


#44) 

4-44-t) 

4i*-i(4-i) 

#44-n4) 

4(4-i) 

4(4-i)-M4-ti4) 


DXhhk  (xk), 

DXk_1fk-l  (4-1  )> 

+  -4—1  )-Pfe— t|fc_  1-4— i  (-c* — i )  >■ 

4|fc-i(4-i)4T(4)(^(4)4+i|44-i)^(4)T  +  4)  , 
4-1|T-i4T-i(4-1)4|jt-1(4-1)“1> 

Pk-i\k-iFk-i(4.-i)Hk  (4)  (#44)4+144-1  )4T (4)  +  #*) 


The  covariance  of  the  states  xjj._i  and  xlk,  denoted  by  Pk-itk\k-i  Is  given  by 


(4-i,*i*-i(4-1>4))  1 


*— 1|*— 1 


+  4T-i(4-i)°jt-i4-i(4-i) 
— ^*_i4-i(4-i) 


-4T_  i(4_i)«41  \ 

^k-i  +  Hk  (xi)RklHk(4))  ' 


Note  that  these  equations  represent  precisely  the  iterations  in  the  extended  Kalman  filter-smoother. 1 

In  the  case  N  >  1,  we  conjecture  that  an  iteration  of  the  extended  Kalman  filter-smoother  over  several  states  corre¬ 
sponds  to  Gauss-Newton  steps  in  the  solution  of  the  nonlinear  least  squares. 


2.2.3.  Remarks  on  The  Nonlinear  Least  Squares  Approach 

Given  the  above  observations,  one  would  in  general  add  a  globalization  for  the  nonlinear  least  squares  problem.  This 
could  be  achieved  with  a  line  search  or  a  trust  region  method  (i.e.,  a  modern  form  of  the  Levenberg-Marquardt  update). 
In  addition,  one  one  can  combine  a  full  Newton  step  with  one  of  these  globalization  techniques.  We  state  without  proof 
that  the  full  Newton  step  corresponds  to  a  quadratic  Kalman  smoother-filter  pass  over  the  data. 


3.  ML  AND  MAP:  THE  DISCRETE  LINEAR  PROBLEM 


We  now  examine  the  batch  ML/MAP  estimation  formulations  for  the  case  of  linear  problems.  Relative  to  the  nonlinear 
problem,  two  aspects  are  important  in  the  linear  case:  (i)  the  least  squares  solution  is  computed  in  precisely  one  Gauss- 
Newton  step,  and  (ii)  an  efficient  compact  formulation  (with  or  without  a  prior)  is  possible  and  we  present  a  new  approach 
here.  The  linear  problem  is  important  in  tracking  because  some  nonlinear  problems  can  be  transformed  to  linear  ones  via 
the  converted  measurement5  approach. 


3.1.  The  mathematical  model 

The  discrete  process  and  observation  models  are  given  by 


1  —  Fk^k  T  Qk  +  PfeTCfc  ,  Zk  —  HkXk  +  Uk  T  Vk ,  Xq 


where 


N(x0,P0) 


Xk  =  77-dimensional  state  vector, 

Fk  =  n  x  n  matrix, 

gk  =  n-dimensional  known  input  vector, 
T*  =  n  x  r  matrix, 

Wk  =  r-dimensional  plant-noise  vector. 


Zk  =  m-dimensional  observation  vector, 

Hk  =  7Ti,  x  n  matrix, 
uk  =  777, -dimensional  known  input  vector, 

Vk  =  777, -dimensional  observation-noise  vector. 


For  the  discrete  estimation  model,  wk  and  Vk  are  assumed  to  be  independent  zero-mean  Gauss-Markov  white  sequences 
such  that  E{wkwJ}  =  QkSk-j.  E{vkvJ}  =  Rk6k-j,  E{x0wl}  =  0,  E{xovx}  =  0,  x0  ~  N(x 0,  P0)  where  6k-j  is 
the  Kronecker  delta  function,  and  Qk  and  Rk  are  positive  definite  r  x  r  and  m  x  m  covariance  matrices,  respectively. 

REMARK  2.  The  form  of  the  linear  estimation  problem  (incorporation  of  gk  and  uk)  is  intended  to  cover  the  Gauss- 
Newton  step  in  the  nonlinear  estimation  problem. 


3.2.  Batch  ML  estimation  with  process  noise 

Given  the  sequence  of  measurements  {zo,  Zi. ... .  zjy}.  the  problem  is  to  estimate  {xq.  xt .... .  xjv}  assuming  that  there 
is  no  prior.  (Note  that  the  notation  has  changed  now  from  the  previous  section  in  which  the  measurements  were  numbered 
beginning  at  one  (1)).  The  determination  of  {xo,Xi, . . . ,  xjv}  via  a  maximum  likelihood  estimation  is  the  solution  of  the 
linear  least  squares  problem 

1  i  N  i  N 

Minimize  J  =  -\\z0  -  HoXo  ~  Uoll^-i  +2^2  H Zk  ~  HkXk  ~  Uk^R~1  +  2  ^  ^ Xk  ~  _  9k-1^n~i1 

k= 1  k  1 


3.3.  Batch  MAP  estimation  with  process  noise 

Given  the  sequence  of  measurements  denoted  by  {zi, . . . ,  Zn}  at  the  times  {ti,  t2,  ■  ■  ■ ,  tjv}-  respectively,  and  assuming 
that  there  is  a  complete  or  partial  prior  at  to  <ti,  the  estimation  of  { xo ,  xi, . . . ,  xjv}  can  be  posed  as  a  linear  least  squares 
problem  (15)  in  which  the  equation 

Z0  =  H0X  o  +7i0+  770,  (16) 

where  Vq  ~  A"(0.  Rq  ') ,  can  be  used  to  cover  the  case  of  both  the  partial  and  full  priors.  Here  are  the  two  cases. 

Case  1:  Given  a  prior  Xo  ~  A'" (To •  To)  at  a  time  to  before  the  first  measurement  at  time  t\  (to  <  tf)  so  that  the  mea¬ 
surements  and  states  are  {zi, . . . ,  zn}  and  {xo,  •  ■  ■ ,  Xn}.  respectively,  then  Zq  =  HqXo  +  Uo  +  Vo  is  interpreted 
as 

Xo  =  lx  o  +  0  +  (~Po)  (17) 

i.e.,  z0  =  x0 ,  H0  =  I,  u0  =  0,  770  =  ~Po- 

Case  2:  If  the  prior  at  a  time  before  the  first  measurement  is  just  part  of  the  state,  then  one  can  proceed  as  in  the  following 
example.  Suppose  the  state  x  is  partitioned  as  x  =  [p  v]T  ,  then  Xq  =  Xq  —  //q  is  replaced  by 

(0  I)  x0  =  (0  I)  x0  -  (0  I)  p0 


3.3.1.  A  compact  least  squares  formulation 

The  first  step  in  the  derivation  of  the  compact  form  is  to  note 


xp  —  | 

where 
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-p+ i  ^ p *( 
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Then,  one  can  write 


In  vector  notation 


where 


—  Fp  ( Ei=P+ 1  ^pjrj-itt'i-i) 
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k 
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(18) 
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Up 
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3.3.2.  Solution  methods  for  the  linear  least  squares  problem 

In  this  subsection,  we  assume  that  the  given  data  is  {zo,  z  1 , . . . ,  £jv}  at  times  {to,  ft ,  ■  ■  ■ ,  tjv)  so  that  we  are  considering 
the  problem 


1  i  N  i  " 

Minimize  J  =  -\\z0  -  H0x0  -  u0 ||^_i  +  -  ^  || zk  -  Hkxk  -  uk\\2R-i  +  -  ^  \\xk  -  Fk_ kxk-i 


9k- i||q-i 


fc=i 


fe=i 


The  two  prior  cases  discussed  above  can  be  handled  within  the  current  formulation  by  appropriately  interpreting  the 
equation  Zq  =  HqXq  +  Uq  +  The  solution  xk  can  be  posed  as  that  of  a  linear  least  squares  problem. 


Minimize  \\1 iNkxk  -  ZNk\ 


wz 


(23) 


where  Wjvfe  =  E{£Nk£jfk }  is  the  weighting  or  covariance  matrix,  Q  =  diag  (Q0, . . .  ,Q jv-i)  and  R  =  diag  (R0, . . . ,  Rjy)- 
The  normal  equation  for  this  problem  is 


('HjfkWNl'HNk)xk  —'HjfkWNl£Nk-  (24) 

Assuming  (RjfkW^lRiyk)  is  nonsingular,  the  solution  to  the  normal  equations  is 

xk  =  ( ^-Nk^Nk^-Nk )  T^Nk^ Nk^Nki  (25) 

where 

Pk  =  {nTNkW^kHNk)-1  (26) 

is  identified  as  the  covariance  matrix  of  xk . 

The  weighting  matrix  W]yk  =  ['tfp9](p,9)e[o,i,...,iv]  x[o,i,...,iv]  has  the  following  form 
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where  the  computation  of  wpq  =  E{epke^k}  £  5R™X™,  which  is  a  block  symmetric  and  positive  semi-definite  matrix,  can 
be  divided  into  several  natural  cases: 


For  q  <  p  <  k 

For  q  =  p  <  k 

For  q  <p  =  k 
For  q  =  p  =  k 
For  p  >  k  >  q 

For p>  q>  k 
For  q  =  p  >  k 


wv 


wv 


=  HP(  Y, 


i=p+ 1 


IT 

1 


=  RPP  +  Hp(  Y  )Hi 

^  i=P+ 1 


Wkq  =  0 

tttfcfc  —  Ekk 

Wpq  —  0 


wv 


W„ 


=  hp(  Y  <MI'<  iQi  H  irr,)<i^)//, 


iT 
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=  Rpp+Hpl  Y  ^pdFi-iQi-ii-iFl-i^pi) Hp  ■ 

'  i=  h+1 


(28) 

(29) 

(30) 

(31) 

(32) 

(33) 

(34) 


The  shape  of  the  matrix  Wjvfc  depends  on  the  value  of  k  and  generally  separates  into  the  following  cases:  (a)  k  =  0,  (b) 
1  <  k  <  N  —  1,  (c)  k  =  N.  In  each  case,  the  kth  row  and  column  are  zero  with  the  single  exception  of  the  block  wkk. 
The  form  for  1  <  k  <  N  —  lis  given  above,  while  the  form  for  k  =  0  and  k  =  N  are  as  follows: 


/woo 

0 
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wn 

WV2  ■  ■ 
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W21 

W22 
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V  0 

WN1 
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(k  =  0) 


(35) 


/  Woo 

U>01 

WoJV-l 

0  \ 

Wi  0 

Wuv-1 

0 

WN- 10 

Wjv-11  ■ 

Wjv-IN-1 

0 

V  o 

0 

0 

wnnJ 

(k  =  N). 


(36) 


REMARK  3.  The  batch  estimation  solution  for  the  case  of  process  noise  but  no  prior  is  handled  by  deleting  the  first  row 
and  column  of  (27)  and  altering  (22)  accordingly. 


4.  SIMULATION  RESULTS 

To  verify  the  accuracy  of  the  batch  estimator  and  the  value  of  using  process  noise  and  the  prior  state  estimate  within  the 
batch  estimator,  a  simulation  was  implemented  for  the  linear  problem  using  the  estimator  in  (25).  A  constant  velocity 
single  target  was  propagated  from  an  initial  state.  Cartesian  3-D  measurements  were  generated  and  random  Gaussian 
noise  was  added  to  each  measurement  (covariance  of  R  =  100  x  I).  Gaussian  process  noise  was  also  added  to  the 
dynamic  propagation  of  the  target  (covariance  of  Q  =  100  x  I).  The  batch  filter  with  a  window  size  of  A  =  11  was 
implemented  with  the  measurement  noise  covariance  matched  to  the  actual  Gaussian  noise  covariance  used  to  generate 
the  data.  Different  cases  were  assumed  about  use  of  the  process  noise  and  the  prior  state  within  the  batch  estimator,  as 
will  be  explained. 

In  the  first  case,  shown  in  Figure  1,  the  batch  estimator  only  assumed  the  presence  of  measurement  noise  (i.e.,  no 
process  noise  and  no  prior  estimate  were  used  in  the  estimator).  The  first  figure  shows  the  position  RMSE  accuracy  versus 
the  the  index  k  £  {1, . . . ,  11}  while  the  second  plot  shows  the  velocity  RMSE  accuracy.  The  index  k  refers  to  batch 
index  (a  discrete  time  index  within  the  window)  where  the  state  estimate  is  generated.  Also  plotted  is  the  square-root  of 
the  trace  of  the  batch  estimator  covariance  matrix.  This  represents  the  “perceived”  accuracy  of  the  estimate  as  produced 
by  the  batch  estimator.  Several  important  things  should  be  noticed  in  these  plots.  First,  the  shape  of  the  position  RMSE 
implies  that  the  estimate  is  more  accurate  at  certain  time  points.  The  position  estimate  is  most  accurate  at  k  =  3  and  k  =  9 
while  the  velocity  estimate  is  most  accurate  at  k  =  6.  Meanwhile,  the  square-root  of  the  trace  of  the  covariance  is  much 
smaller  than  the  RMSE,  indicating  that  the  estimator  thinks  it  is  doing  much  better  than  it  actually  is.  This  mismatch  is 
attributed  to  the  fact  that  the  assumed  model  (just  measurement  noise)  is  not  correct. 

The  next  case  is  shown  in  Figures  2.  Here,  the  batch  estimator  was  implemented  with  both  measurement  and  process 
noise.  Comparing  Figure  1  and  Figure  2  we  see  that  the  latter  estimator  is  much  more  accurate  than  the  former  (i.e., 
including  process  noise  in  the  estimator  improves  the  accuracy).  In  addition,  the  RMSE  and  the  square-root  of  the  trace 
of  the  covariance  matrix  match,  indicating  that  the  uncertainty  reported  by  the  batch  estimator  covariance  represents 
the  actual  performance  of  the  state  estimate  (i.e.,  there  is  a  model  match).  Also  notice  that  the  estimator  accuracy  is 
significantly  worse  at  the  end  points  of  the  batch,  k  £  {1, 11},  while  it  is  roughly  the  same  for  the  internal  points, 
k  £  {2, 3, . . . ,  10}.  This  implies  that  if  one  wants  to  obtain  the  best  possible  estimate  of  the  target  state,  then  one  should 
choose  an  internal  point  in  the  batch  at  which  to  request  the  estimate.  Generally  speaking,  this  result  makes  sense  because 
a  “polynomial  curve  fit”  to  data  is  the  least  accurate  at  the  end-points  in  a  data  set. 

The  third  case  is  shown  in  Figures  3.  Here,  the  batch  estimator  is  implemented  with  measurement  and  process  noise 
and  a  prior  state  estimate.  The  covariance  of  the  prior  state  was  set  to  Po  =  100  x  I.  The  prior  state  estimate  used 
by  the  batch  estimator  was  provided  at  the  time  of  the  first  measurement,  i.e.,  k  =  1.  These  plots  show  that  the  RMSE 
accuracy  agrees  with  the  square-root  of  the  trace  of  the  covariance,  thus  the  model  match  is  achieved.  Also  notice  that 
at  k  £  {1,  2, 3}  that  the  estimator  accuracy  is  much  better.  This  is  the  influence  of  the  prior  state  (i.e.,  the  availability  of 
prior  information  improves  the  estimate  here).  As  k  increases,  the  prior  state  must  be  propagated  from  time  1 1  to  time  tj;. 
As  it  is  propagated,  the  uncertainty  grows  and  the  value  of  the  prior  information  diminishes.  At  k  =  4  we  see  that  the 
RMSE  is  equivalent  to  that  shown  in  Figure  2,  thus  there  is  no  benefit  provided  by  the  prior  here. 

The  forth  case  is  shown  in  Figure  4.  Here,  we  have  created  a  constant  accelerating  target  with  an  an  initial  state  value 
of  Xq  =  [100,  100,  100,  50,  50,  50,  10,  10,  10].  Measurement  noise  was  added  with  R  =  100  x  I.  Meanwhile,  in 
our  batch  estimator  we  assume  only  a  nearly  constant  velocity  model  (i.e.,  a  six-state  model).  In  Figure  4,  the  plot  on  the 
left  shows  the  RMSE  and  the  covariance  trace  for  the  batch  estimator  that  assumes  no  process  noise.  The  plot  on  the  right 


Figure  1:  Plot  of  the  position  and  velocity  RMSE  for  measurement  noise  only  case. 


Figure  2:  Plot  of  the  position  and  velocity  RMSE  for  measurement  and  process  noise  case. 


is  for  the  batch  estimator  with  process  noise,  where  Q  =  100  x  I.  As  can  be  seen,  the  estimator  without  process  noise 
performs  much  worse  than  the  one  with  process  noise.  Furthermore,  the  covariance  estimate  is  highly  optimistic  (i.e., 
it  thinks  the  accuracy  is  much  better  than  it  actually  is),  while  the  square  root  of  the  covariance  trace  for  the  batch  with 
process  noise  is  comparable  to  the  RMSE.  Hence,  the  estimation  accuracy  and  the  reported  covariance  are  much  better 
when  using  the  batch  with  process  noise. 

In  summary,  single  target  simulations  were  performed  to  verify  the  correctness  of  the  batch  filter  implementation.  In 
the  case  where  both  measurement  noise  and  process  noise  were  assumed  in  the  filter  model,  the  RMSE  accuracy  agreed 
with  the  square-root  of  the  trace  of  the  covariance.  This  agreement  proved  that  the  estimator  was  consistent  and  performs 
correctly.  When  we  compared  the  performance  to  the  case  where  no  process  noise  was  assumed  in  the  batch  filter,  we 
found  that  the  estimator  accuracy  was  much  worse,  and  the  covariance  reported  was  inconsistent  with  the  accuracy  of  the 
state  estimate.  When  a  prior  state  is  included  in  the  batch  estimator,  the  information  provided  significantly  improved  the 
batch  estimate,  but  only  at  time  points  near  that  of  the  provided  prior  state  estimate.  Finally,  when  the  target  dynamics 
do  not  match  the  assumed  model  (e.g.,  the  target  is  accelerating  when  a  constant  velocity  model  is  used),  then  the  batch 
estimation  with  process  noise  performs  significantly  better  than  that  without  process  noise. 


Position  Error  Velocity  Error 


Figure  3:  Plot  of  the  position  and  velocity  RMSE  for  measurement  and  process  noise  and  prior  state  estimate  case. 


Figure  4.  Comparison  of  position  RMSE  and  covariance  trace  for  an  accelerating  target  for  batch  estimator  without  (left)  process  noise 
and  with  (right)  process  noise. 


5.  SUMMARY 


In  summary,  this  paper  has  shown  the  formulation  of  batch  MAP  and  ML  estimation  algorithms  where  process  noise  has 
been  included  in  the  model.  The  use  of  process  noise  is  important  in  tracking  applications  where  there  is  uncertainty  in 
the  target  dynamical  model.  We  first  formulated  the  batch  estimator  for  the  discrete  nonlinear  case.  The  MAP  and  ML 
formulations  were  given,  as  well  as  the  relationships  to  the  Iterated  Extended  Kalman  Filter  and  the  Iterated  Extended 
Kalman  Filter  Smoother.  We  then  formulated  the  batch  estimators  for  the  linear  discrete  problem,  and  showed  a  compact 
least  squares  formulation  that  handles  the  cases  of  with/without  a  prior  and  with/without  process  noise.  Finally,  we 
presented  simulation  results  that  demonstrated  the  value  of  the  linear  batch  estimator  with  process  noise  and  with  a  prior. 
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